Journal of Medical Imaging
● SPIE-Intl Soc Optical Eng
Preprints posted in the last 7 days, ranked by how well they match Journal of Medical Imaging's content profile, based on 11 papers previously published here. The average preprint has a 0.01% match score for this journal, so anything above that is already an above-average fit.
Tejaswi, A.; Fyrdahl, A.; Sigfridsson, A.
Background: Cardiovascular magnetic resonance (CMR) quantification of the left ventricular (LV) volumes and ejection fraction (EF) typically involves manual segmentation of many short axis (SAx) and long axis (LAx) slices of the left ventricle. The scan time and the number of breath holds are proportional to the number of slices. We aimed to evaluate a geometric model of the left ventricle that could enable planimetry from a reduced number of slices. We sought to determine whether acceptable accuracy was retained for evaluating the End Diastolic Volume (EDV), End Systolic Volume (ESV), Stroke Volume (SV), and EF to provide a rapid and reliable clinical alternative. Methods: A cohort of 342 patients, median age: 54 (40 - 65) years, with full-stack CMR examinations was used. Nine geometrical combinations were evaluated: 3, 4 or 5 short axis slices and one of three LAx orientations (2-chamber, 3-chamber or 4-chamber) by retrospectively decimating the full-stack acquisition. LV volumes were calculated as a sum of trapezoidal approximations for apical and mid-cavity slices and a generalized prismoidal model at the base. The accuracy of the volume calculations was quantified against the full-stack reference for the EDV, ESV, SV, and EF using concordance correlation coefficient (CCC), two-way repeated measures ANOVA, pairwise tests, and Bayes factor log10(BF10) analysis. Results: The choice of the long axis (LAx) view was the most influential driver of accuracy (g2 = 0.104, for EDV), approximately 50 times more impactful than the number of SAx slices (g2 = 0.002, for EDV). Volumes calculated using the combination of 2-chamber LAx view and 5 SAx slices had the highest concordance with the full stack (CCC>0.90). While the estimated absolute volumes displayed a systematic negative bias, EF and SV remained highly robust due to bias cancellation. 
For a 2ch + 5 SAx protocol, EF bias was just 0.83% (LoA: -6.18 to 7.84%), with a minimum detectable change (MDC) of 7.01%, compared to 8.7% reported for expert human readers, suggesting strong concordance. Bayesian paired-samples t-tests yielded log10(BF10) = 6.42 in favor of 5 SAx over 3 SAx, constituting decisive evidence on the Jeffreys scale. The bias and limits of agreement (LoA) for stroke volume and ejection fraction were found to be lower than the scan-rescan reproducibility reported in the literature. Conclusion: This reduced-slice geometric model allows for a reduced number of breath holds compared to a conventional full-stack CMR acquisition and provides acceptable accuracy, with bias less than scan-rescan variability.
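The bias-cancellation argument in this abstract can be illustrated with a minimal sketch. It assumes equally spaced slices and covers only the trapezoidal part of the volume model; the generalized prismoidal treatment of the base is not reproduced. The slice areas, spacing, function names, and the assumption that the volume bias is proportional are all hypothetical and are not taken from the preprint.

```python
def trapezoid_volume(areas_cm2, spacing_cm):
    """Sum trapezoidal slabs between consecutive slice areas (cm^2 -> mL)."""
    pairs = zip(areas_cm2, areas_cm2[1:])
    return sum(0.5 * (a0 + a1) * spacing_cm for a0, a1 in pairs)


def ejection_fraction(edv_ml, esv_ml):
    """EF in percent from end-diastolic and end-systolic volumes."""
    return 100.0 * (edv_ml - esv_ml) / edv_ml


# Hypothetical apex-to-base slice areas at end diastole and end systole.
ed_areas = [0.0, 12.0, 18.0, 20.0, 16.0]
es_areas = [0.0, 5.0, 8.0, 9.0, 7.0]
spacing = 2.0  # assumed slice spacing in cm

edv = trapezoid_volume(ed_areas, spacing)
esv = trapezoid_volume(es_areas, spacing)
ef = ejection_fraction(edv, esv)

# Assumption: both volumes are underestimated by the same factor.
scale = 0.9
ef_biased = ejection_fraction(scale * edv, scale * esv)
```

Because EF is a ratio of volumes, a common multiplicative error cancels, so `ef` and `ef_biased` agree. An additive or phase-dependent bias would not cancel in the same way.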
Yang, K.; Shi, P.; Huang, H.; Musio, F.; Baazaoui, H.; Aydin, O. U.; Hilbert, A.; Hamadache, R. E.; Yalcin, C.; Zhang, M.; Falcetta, D.; de la Rosa, E.; Shit, S.; Prabhakar, C.; Wittmann, B.; Rokuss, M. R.; Kirchhoff, Y.; Al-Maskari, R.; Hoeher, L.; Juchler, N.; Casamitjana, A.; Cleary, J.; Schmick, A.; Baumgartner, P.; Deseoe, J.; Vandans, O.; Lee, D.; Oh, K.; LaBella, D.; Mazher, M.; Niederer, S. A.; Qayyum, A.; Liu, Y.; Chen, J.; Kim, W.; Asawalertsak, N.; Kim, M.; Shin, D.; Park, S.-H.; Kikuchi, S.; Zhang, Y.; Liu, J.; Cui, Y.; Qiu, Y.; Verschuur, A.; Zhang, J.; van der Schaaf, I.; Su, R.;
We present the TopBrain 2025 Challenge, the first benchmark for fine-grained multiclass segmentation of the whole brain vasculature in both computed tomography angiography (CTA) and magnetic resonance angiography (MRA). Building on the TopCoW challenge, TopBrain scales vessel annotation from the Circle of Willis to the entire brain, introducing a dataset of 90 annotated volumes across 48 landmark vessel classes spanning arterial and venous systems, of which 50 training volumes are publicly released. Vessel definitions were consolidated from established neuroanatomical references into a unified annotation scheme, and vessel caliber measurements along the centerline are reported for the first time across the whole brain vascular anatomy. To address the unique challenges of multiclass brain vessel segmentation, we propose an evaluation framework that accounts for detection in segmentation performance, assesses anatomical plausibility, and introduces novel contamination metrics that characterize inter-class prediction errors. Fifteen teams from over 220 registered participants submitted algorithms to the benchmark. The top-performing teams built on nnUNet with principled system design choices, achieving around 80% Dice scores, near-zero invalid neighbor counts, over 60% F1 scores for side-road vessels, and below 18% foreground contamination ratio. Larger vessels are easier to segment, while smaller and more complex vessels remain the true bottleneck. The annotated datasets and podium-finish algorithms are made publicly available on Zenodo.
Sharma, R.; Beeche, C.; Dong, J.; Zhuang, R.; Qu, H.; Zhang, R.; Gangaram, V.; Goswami, P.; Xin, J.; Ballard, J.; Goldberg, A.; Sagreiya, H.; Long, Q.; Chen, T.; Witschey, W. R.
The surge in medical imaging has spurred the development of vision-language models (VLMs) to alleviate radiologist workloads. However, clinical deployment is hindered by the lack of meaningful evaluation frameworks. Current metrics - ranging from semantic similarity to large language model (LLM) based judges - often fail to distinguish between clinically trivial and critical discrepancies, poorly reflecting real-world clinical judgment. To address this, we introduce DISCERN (Discordance and Significance-aware Entity-level Radiology Report Comparison). DISCERN is a significance-aware framework that weighs report errors based on their potential impact on patient care. Our results demonstrate that DISCERN powered by closed source LLMs aligns more closely with expert radiologist assessments than traditional metrics or current LLM evaluators, providing a more interpretable and clinically relevant benchmark. By modeling radiologist prioritization and entity-level feedback, DISCERN facilitates targeted model refinement and ensures the safer integration of generative AI into clinical workflows.
Romanov, M.; Kireev, M.; Didur, M.; Cherednichenko, D.; Korotkov, A.; Valdes-Sosa, P.; Fan, Q.; Wang, Q.
One of the prominent methods in neuroimaging data processing is SSM-PCA, which is based on principal component analysis and allows for the identification of diagnostically significant patterns in the form of statistical maps. We developed software, PIE Toolbox, that employs SSM-PCA and classification based on diagnostic patterns derived from functional and structural tomographic brain imaging. The program supports the entire analysis pipeline, including preprocessing of brain images, diagnostic pattern extraction, building classification models, and prediction based on them. The resulting diagnostic patterns are weighted principal components obtained through SSM-PCA, or their linear combinations. PIE Toolbox allows selection of relevant structural and functional brain patterns, computation of their expression values in regions of interest, classification using support vector machines, and evaluation of model performance via cross-validation. This approach enables the use of patterns as features of intergroup differences for individual diagnosis. The software has been validated on both simulated and ADNI datasets.
Hameed, S.; Henry, K.; Jiang, F.; Bhusal, B.; Dillenbeck, H.; Gakenheimer-Smith, L.; Webster, G.; Golestani Rad, L.
Pediatric patients with cardiac implantable electronic devices (CIEDs) face limited MRI access due to RF-induced heating, and computational modeling is increasingly used to characterize this risk. The validity of these simulations, however, depends on pairing body models with clinically realistic lead configurations, guidance that is currently lacking. We retrospectively analyzed 302 CIED surgeries in 281 pediatric patients to derive weight-based constraints for simulation design. Weight alone discriminated epicardial from endocardial lead implantation with AUC = 0.90, and adding age and height yielded no improvement, supporting weight as a sufficient single-parameter selection metric. The probabilistic crossover between approaches occurred at 44 kg, substantially higher than the 10 to 15 kg threshold commonly cited in the literature, with a broad transition zone of 21 to 66 kg in which both lead types were routinely used. Lead length was likewise weight-constrained: only 25 cm leads were observed in patients below 6 kg, and leads of 45 cm or longer were uncommon below 50 kg. These findings yield a three-tier framework, with epicardial-only configurations below 21 kg, dual configurations within 21 to 66 kg, and weight-thresholded lead lengths throughout, enabling MRI safety simulations to focus on clinically realizable anatomy and device combinations.
Yang, J.; Li, L.; Cao, J.; Zhang, J.
Objective: This study aims to compare the advantages and disadvantages of deep learning image reconstruction (DLIR) and adaptive statistical iterative reconstruction-V (ASIR-V) in thin-slice (2.5 mm) CT images of hepatic lesions characterized by high and low contrast. Additionally, the study seeks to determine the optimal DLIR strength for the evaluation of liver lesions. Methods: A retrospective analysis was performed on 90 patients who underwent abdominal contrast-enhanced CT scans. Group A comprised 48 patients with low-contrast lesions, while Group B included 42 patients with high-contrast lesions. The acquired images were reconstructed using post-processing DLIR at low (DLIR-L), medium (DLIR-M), and high (DLIR-H) strengths, all with a slice thickness of 2.5 mm (subgroups A1-A3, B1-B3). Furthermore, images were reconstructed with ASIR-V at 50% strength at slice thicknesses of 2.5 mm and 5 mm (subgroups A4/B4 and A5/B5, respectively). CT values and standard deviations (SD) of the liver and lesions were measured, and the corresponding signal-to-noise ratio (SNR) and contrast-to-noise ratio (CNR) were calculated. The edge rise slope (ERS) was determined using ImageJ software by measuring CT values along a line from the liver parenchyma to the lesion. Objective metrics were compared using one-way ANOVA, with independent samples t-tests applied for inter-group differences. Subjective scoring, which encompassed noise level, diagnostic confidence, and lesion margin delineation, was conducted by two radiologists, with differences analyzed using the Kappa test. Results: Objective evaluation revealed a progressive decrease in lesion SD and a progressive increase in SNR and CNR from subgroups A1/B1 to A3/B3. The SD of subgroup A2 decreased by 57.4% compared to A4, while the SNR and CNR of A2 increased by 19.3% and 24.6%, respectively, compared to A4. Although subgroup B2 had a lower SNR than B5, the difference was not statistically significant. SNR and CNR in B2 increased by 24.1% and 11.9%, respectively, compared to B4. 
ERS gradually decreased from A1/B1 to A3/B3. ERS values in A2 and B2 increased by 27.0% and 39.4%, respectively, relative to A5 and B5. Although A3 had a lower ERS than A1 and A2, all DLIR subgroups exhibited higher ERS than A5; similar trends were observed in Group B. Subjective evaluation indicated good inter-reader agreement (Kappa > 0.61, p < 0.05). As DLIR strength increased, noise scores rose progressively in both groups. However, noise in A2 and B2 was lower than in A4/A5 and B4/B5. Diagnostic confidence and lesion margin delineation scores were highest in A2 and B2, while all subjective scores were lowest in A5 and B5. Discussion: Most prior studies evaluated the liver or vessels, or confirmed that image quality can be guaranteed at low doses. However, there are few studies on specific individual lesions, which this study therefore investigates. The details and detection rate were analyzed separately to confirm the clinical acceptability of 2.5 mm DLIR images for lesions of different contrast. Conclusion: For both high- and low-contrast hepatic lesions, DLIR provides superior image quality compared to ASIR-V, with the 2.5 mm DLIR-M setting being optimal. DLIR-M reduces image noise, improves spatial resolution, and produces images more suitable for diagnostic purposes.
Low, Z. X. B.; Rowsthorn, E.; Nazem-Zadeh, M.-R.; Francis, M.; Robb, C.; Howcroft, M.; Whiriskey, R.; Brodtmann, A.; McNeil, J. J.; Law, M.
We trained a self-configuring nnU-Net model for cerebral microbleed (CMB) segmentation in a heterogeneous multicenter sample (n=264), including 1.5T and 3T field strengths, SWI and T2*-GRE sequences, and community and clinical cohorts. Model performance was evaluated using 5-fold cross-validation with a focus on object-level detection metrics. Real-world performance was evaluated on scans from an unseen dataset of people with cerebrovascular disease (n=20). The model achieved 0.82 cluster Dice, 0.88 precision, and 0.77 sensitivity on hold-out test data. Notably, the model demonstrated a low false-positive rate, averaging 0.58 false positives (FPs) per scan, an improvement on existing publicly available models. The model achieved high performance in a dataset of people with Alzheimer's disease and mild cognitive impairment (0.89 cluster Dice, 0.94 sensitivity), supporting its utility in clinical settings where ARIA-H monitoring is critical. In external validation, the model remained robust, with 0.79 sensitivity and 0.95 FPs per scan. By leveraging a heterogeneous training strategy and a self-adapting architecture, we demonstrate that deep learning can achieve high-precision CMB detection that is robust to domain shifts. The low FP rate suggests this publicly available pipeline is suitable for automated screening and lesion counting in heterogeneous large-scale clinical trials, reducing the burden of manual quantification.
Hofmeister, J.; Brina, O.; Rosi, A.; Bernava, G.; Reymond, P.; Muster, M.; Lovblad, K.-O.; Machi, P.
Background: Three-dimensional visualization and quantitative analysis of cerebral arteries on 3DRA are central to endovascular treatment planning, device selection, and cerebrovascular research. Manual segmentation is time-consuming and operator-dependent, yet no open-source deep learning model has been prospectively validated for this task on 3DRA. Methods: A nnUNet v2 model was trained for binary cerebral artery segmentation on 400 consecutive 3DRA acquisitions from three angiographic systems, comparing four configurations across architectures and loss functions. The best-performing configurations were prospectively validated on 40 patients using a dual approach: quantitative metrics (DSC, clDice, HD95, ASD, Precision, Recall), and blinded expert qualitative evaluation by two interventional neuroradiologists assessing 12 arterial segments, a global quality score, and clinical usability across 40 test cases. Results: The ensemble model achieved median DSC 0.917, clDice 0.932, and HD95 1.494 mm. Global quality scores were significantly lower for nnUNet v2 than for expert segmentations (median 4 vs 5, p<0.001), but nnUNet v2 segmentations were rated clinically usable in 88-90% of cases versus 95-98% for expert segmentations, without significant difference on the binary usability criterion. A consistent proximal-to-distal quality gradient was identified, with comparable scores at proximal arteries and the largest differences at distal arterial segments. Conclusion: nnUNet v2 with topology-aware training provides clinically usable cerebral artery segmentations on 3DRA, prospectively validated through both quantitative metrics and structured expert qualitative assessment, and represents a reproducible open-source foundation for endovascular and research applications.
Schmidlechner, T.; Stumpo, V.; Jehli, E.; Zerweck, L.; Bellomo, J.; Gönel, M.; Müller, F.; Sebök, M.; Bink, A.; Kulcsar, Z.; Weller, M.; Regli, L.; Fierstra, J.; van Niftrik, C. H. B.
Hypoxia-targeted BOLD MRI is a novel technique, which probes oxygenation physiology in response to a controlled transient hypoxia stimulus. In glioblastoma, the signal response is spatially and temporally heterogeneous. We developed a voxel-wise temporal decomposition framework for hypoxia-targeted BOLD MRI that separates the arrival of responses, transition phases, and steady state during controlled isocapnic hypoxia. Twenty healthy controls underwent 3-T BOLD MRI during a double hypoxic step challenge to establish a normative reference. Three patients with newly diagnosed glioblastoma were included as proof-of-concept cases. For each voxel, we estimated response arrival delay (Delaycorr), delay to plateau, delay to return and an O2-normalized steady-state response (HypoxiaSS). Healthy-control maps were used to construct a voxel-wise normative atlas and, for HypoxiaSS, a global-response-adjusted model for patient deviation mapping. In healthy controls, HypoxiaSS showed lower supratentorial between-subject variability than both whole-stimulus comparators (coefficient of variation: 1.77 versus 2.36 for Hypoxiaavg) and higher voxel-level step-to-step agreement (ICC(2,1): median 0.951 versus 0.792 for Hypoxiaavg). Whole-stimulus averaging exhibited a systematic step-2 signal amplification present in 19 of 20 subjects, which was absent from HypoxiaSS. A single global response scalar explained a median 72.5% of voxel-wise between-subject variance in HypoxiaSS. In proof-of-concept patient analyses, G-adjusted HypoxiaSS deviation maps and timing maps identified spatially coherent abnormalities that were partly complementary and extended beyond conventional MRI-defined lesion margins. Temporal decomposition improves the stability and interpretability of hypoxia-targeted BOLD MRI and provides a practical framework for population-referenced physiological mapping and atlas-based deviation mapping in glioblastoma.
Sadikov, A.; Cai, L. T.; Xiao, J.; Yuh, E. L.; Choi, H. L.; Sun, X.; Mac Donald, C. L.; Vassar, M. J.; Diaz-Arrastia, R.; Giacino, J. T.; Okonkwo, D. O.; Robertson, C. S.; Stein, M. B.; Temkin, N.; McCrea, M. A.; Jain, S.; Manley, G. T.; Mukherjee, P.; TRACK-TBI Investigators
Generalizable neuroimaging biomarkers that detect cerebral cortical changes after traumatic brain injury (TBI) and predict patient outcomes are needed to improve care and to develop targeted therapies. We used morphometric inverse divergence (MIND) analysis of structural MRI to investigate cortical gray matter morphological networks cross-sectionally and longitudinally after TBI and correlate these with symptoms, disability and cognition six months after injury. Our findings support the Triple Network Model from functional MRI of post-traumatic alterations in the relationship between task-positive, default mode and salience networks. However, the strongest associations between early cortical similarity metrics and long-term patient outcomes involved the dorsal attention network and the limbic network as well as similarity metrics across Mesulam's hierarchy of laminar differentiation. Since MIND mapping of cortical gray matter networks only requires data that is a routine part of standard clinical MRI protocols and does not need image harmonization across different scanners, this work reports a promising new tool that is immediately available for advancing research and clinical care in TBI.
Liu, T.; Zeng, X.; Snitz, B. E.; Karikari, T. K.; Deek, R. A.
Blood biomarker models are increasingly used in Alzheimer's disease and related dementia translational research, but predictive performance can be inflated when the same dataset is used for both model development and evaluation. We assess the effect of data double dipping using simulations and NULISA proteomic data from the MYHAT-NI community-based cohort to predict brain amyloid-beta neuroimaging status. In both settings, training AUC increased as more biomarkers were added, while testing AUC peaked earlier and then declined. These findings show that data double dipping can inflate model performance and highlight the need for external validation or internal validation with data partitioning.
Xie, M.; Zhou, Y.; Li, H.; Xie, Y.; Yan, X.
Background: The specific 3D morphological substrates distinguishing the newly defined massive and torrential functional tricuspid regurgitation (FTR) phenotypes from standard severe disease remain under-characterized. Objectives: This study investigates the 3D geometric changes of the tricuspid valve (TV) apparatus across the spectrum of FTR, specifically focusing on the structural definition of massive and torrential grades. Methods: Three-dimensional (3D) transesophageal echocardiography (TEE) was performed in 322 patients with FTR secondary to left-sided heart disease. Patients were stratified into mild-moderate (n=166), severe (n=82), and massive-torrential (n=74) groups. TV geometry, including annular dimensions, leaflet tethering, and subvalvular apparatus, was quantified using 3D modeling software. Results: Patients with massive-torrential TR were characterized by advanced age, female predominance, and atrial fibrillation (75%). 3D analysis demonstrated that massive-torrential TR represents a distinct phenotype defined by extreme annular circularization (ellipticity index 1.0) and planar flattening (P < 0.001). Furthermore, these patients exhibited a critical leaflet-annulus uncoupling, where compensatory leaflet growth (relative length < 80%) failed to match the massive annular dilation. Consequently, the regurgitant orifice in massive-torrential grades appeared highly complex, frequently manifesting as multiple irregular orifices. Conclusions: Massive and torrential FTR are characterized by a unique geometric profile involving extreme annular circularization, severe leaflet tethering, and leaflet-annulus uncoupling. These morphological insights suggest that conventional repair strategies may be insufficient for these advanced phenotypes, highlighting the necessity for pre-procedural 3D TEE to guide device selection.
Doucette, M.; Zhang, Y.; Liao, C.-Y.; Lin, M.-H.; Yan, Y.; Dess, R. T.; Tendulkar, R. D.; Garant, A.; Hannan, R.; Jiang, S.; Nguyen, D.; Desai, N.; Yang, D. X.
Our study evaluated whether a deep learning auto segmentation model combined with machine learning triage can streamline radiotherapy clinical trial quality assurance (QA). We analyzed 107 stereotactic ablative radiotherapy (SABR) cases from a multi-institutional phase II clinical trial of neurovascular sparing prostate SABR, focusing on physician contours of the internal pudendal artery (IPA) as a novel organ-at-risk with substantial interobserver variability. Contours were scored by the trial principal investigator as Per-Protocol or Minor Deviation/Unacceptable. We applied a deep learning model for IPA auto-segmentation. Agreement between human and AI contours was then quantified using 14 overlap, distance, and surface metrics, and a supervised classifier was trained on these metrics to flag clinical trial protocol deviations. While AI segmentation achieved only modest geometric accuracy with mean Dice similarity coefficient of 0.446 and 95th percentile Hausdorff distance of 14.23, when incorporating all 14 metrics, a machine learning classifier yielded AUROC of 0.836, flagging all Minor Deviation/Unacceptable cases with 100% sensitivity on the 27 case hold-out set with 6 false positives and no false negatives. AI segmentation combined with metrics-based machine learning can triage protocol deviations within a multi-institution radiotherapy clinical trial, supporting prospective evaluation of AI-assisted trial QA.
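For readers unfamiliar with the overlap metric quoted in this abstract, the following is a minimal sketch of the Dice similarity coefficient on two voxel sets. The preprint's 14 metrics are not enumerated in the abstract, so only Dice is shown; the function name, the voxel coordinates, and the empty-mask convention are assumptions made for illustration.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two sets of voxel coordinates."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))


# Hypothetical physician and model contours as (x, y, z) voxel indices.
physician = {(0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 1, 1)}
model = {(0, 0, 1), (0, 1, 1), (1, 1, 2)}
score = dice_coefficient(physician, model)
```

Here two voxels are shared between a 4-voxel and a 3-voxel mask, so the score is 4/7. For thin tubular structures such as arteries, small spatial offsets can lower Dice substantially, which is one reason distance and surface metrics are often reported alongside it.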
Hett, K.; Dubois, A.; Bonitz, I.; Considine, C. M.; Eaton, J.; Mcknight, C. D.; Claassen, D. O.; Donahue, M. J. J.; Trujillo, P.
Purpose. The choroid plexus (ChP) is the primary source of cerebrospinal fluid and an emerging marker of cerebral health, with enlargement and hypoperfusion reported in aging and neurodegeneration. However, frequent ChP calcifications can confound volumetric and perfusion measures. Although computed tomography (CT) is the gold standard for detecting calcification, it is rarely available in research MRI. Quantitative susceptibility mapping (QSM) offers an alternative sensitive to diamagnetic mineralization but lacks validated susceptibility thresholds. Method. Participants underwent CT and MRI within four weeks, including 3D T1-weighted and a multi-echo gradient echo QSM MRI. ChP calcifications were identified on CT using standard diagnostic criteria. Using the Bayes decision boundary framework, we identified optimal susceptibility thresholds for detecting diamagnetic signals consistent with calcification and compared these thresholds with multiple density levels measured on gold standard CT images. Results. Across all participants (n=20; age = 62.2 ± 12.0 years), the optimal susceptibility threshold separating background ChP signal from calcifications was -0.10 ppm at 60 HU (low-density) and -0.15 ppm at 100 HU (high-density). Susceptibility values within calcified tissue exhibited a linear relationship with CT-derived tissue density. A significant positive association was observed between ChP volume and calcification volume among participants with detectable calcification (beta=2.26, p=0.047). Conclusion. This work should provide a practical framework for quantifying ChP calcifications routinely from MRI. The observed relationship between ChP volume and calcification volume highlights the importance of accounting for calcified tissue, particularly when calcification burden is substantial, when investigating ChP abnormalities in aging and neurodegenerative disease.
Hagan, J.
Background. Cross-validation (CV) is widely used to estimate predictive performance, but can overestimate performance when applied at the observation level to repeated-measures data. When continuous predictor variables are measured repeatedly within subjects and the binary outcome is defined at the subject level, naive observation-level CV introduces data leakage through within-subject dependence, producing optimistically biased estimates of the area under the receiver operating characteristic curve (AUROC). The magnitude of this bias and the performance of alternative partitioning strategies have not been formally characterized for this data structure. Methods. Three CV strategies were compared for estimating subject-level AUROC in ridge logistic regression models: naive observation-level 10-fold CV, subject-level 10-fold CV, and leave-one-cluster-out (LOCO) CV. The framework was applied to a motivating clinical dataset of daily oxygenation measures and retinopathy of prematurity outcomes among 101 extremely low birth weight infants. A factorial simulation study was conducted across 162 parameter combinations varying cluster count (20-150), intraclass correlation (0.1-0.5), within-cluster autocorrelation (0.2-0.8), and outcome prevalence (10-35%), with 500 simulated datasets per condition (76,389 valid datasets total). Results. In the motivating dataset, naive CV produced optimism of +0.078 AUROC units for severe ROP prediction (15 events, 101 subjects) and +0.031 for any ROP prediction (48 events). Subject-level 10-fold CV closely approximated LOCO (deviation ≤ 0.015). In the simulation, naive CV optimism ranged from +0.039 to +0.204 across all conditions, increasing monotonically with higher ICC, higher autocorrelation, fewer clusters, and lower event rates. Subject-level 10-fold CV was essentially unbiased relative to LOCO across all 162 conditions (mean absolute deviation = 0.002). Conclusions. 
Naive observation-level CV meaningfully overestimates discriminative performance in the repeated-measures binary outcome setting and should not be used. Subject-level CV partitioning effectively eliminates this bias. Accordingly, subject-level partitioning should be considered essential, not optional, when validating prediction models using repeated-measures data with subject-level outcomes.
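The subject-level partitioning recommended in this abstract can be sketched as follows. This is an illustrative standard-library implementation, not the author's code; the function name, the seed, and the toy layout of six subjects with three observations each are hypothetical.

```python
import random


def subject_level_folds(subject_ids, n_folds, seed=0):
    """Yield (train_idx, test_idx) with every subject confined to one fold."""
    subjects = sorted(set(subject_ids))
    random.Random(seed).shuffle(subjects)
    # Assign whole subjects (not single observations) to folds, round-robin.
    fold_of = {s: i % n_folds for i, s in enumerate(subjects)}
    for fold in range(n_folds):
        test_idx = [i for i, s in enumerate(subject_ids) if fold_of[s] == fold]
        train_idx = [i for i, s in enumerate(subject_ids) if fold_of[s] != fold]
        yield train_idx, test_idx


# Hypothetical repeated-measures layout: 6 subjects, 3 observations each.
subject_ids = [s for s in "ABCDEF" for _ in range(3)]
folds = list(subject_level_folds(subject_ids, n_folds=3))
```

Each observation index appears in exactly one test fold, and no subject contributes rows to both sides of a split, which is the property that removes the within-subject leakage described above. In practice, scikit-learn's `GroupKFold` provides the same kind of grouping constraint.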
Kwon, W.-A.; Park, S.; Kim, R.; Lee, W.; Park, C.; Kim, T.-S.; Joung, J. Y.
Background: Prostate-specific membrane antigen (PSMA) PET/CT is central to prostate cancer staging and theranostic workflows. To our knowledge, no direct within-patient comparison of [18F]FC303 ([18F]Florastamin) and [68Ga]Ga-PSMA-11 has been reported. We performed a preliminary paired method-comparison study under non-harmonized acquisition protocols. Patients and Methods: Twenty patients with histologically confirmed prostate cancer underwent [68Ga]Ga-PSMA-11 PET/CT (185 ± 37 MBq, 60 ± 10 min) followed by [18F]FC303 PET/CT (370 ± 37 MBq, 105 ± 15 min) on the same PET/CT system within each patient (median interval, 29.5 days). Index targets were anatomically matched to the biopsied or surgically sampled lesion or target region. The primary malignant set included 18 histologically malignant targets; two histology-negative or indeterminate targets were included only in sensitivity analysis. Fixed [68Ga]Ga-PSMA-11-first scan order and the 45-min uptake-time difference were central interpretive constraints. Results: Across five predefined reference organs, [18F]FC303 showed lower SUVmean than [68Ga]Ga-PSMA-11 (all Benjamini-Hochberg-adjusted p < 0.001; [68Ga]/[18F]FC303 geometric mean ratio [GMR], 1.29-3.89). In the primary malignant set, [18F]FC303 lesion SUVmax was lower than [68Ga]Ga-PSMA-11 (median, 11.3 vs 18.1; paired median difference, -5.50; 95% CI, -6.85 to -2.90; Wilcoxon p = 8.4 × 10^-4), with strong rank correlation (Spearman ρ = 0.90). Passing-Bablok regression yielded β = 1.13 (95% CI, 1.04-1.45), and log-Bland-Altman GMR (FC303/[68Ga]) was 0.75, consistent with proportional non-interchangeability. Tumor-to-liver and tumor-to-mediastinum ratios did not differ significantly (GMR, 1.17 [95% CI, 0.94-1.45] and 0.96 [0.80-1.15], respectively); the study was not powered for equivalence. The n = 20 sensitivity analysis showed consistent directionality. 
Conclusions: Under non-harmonized acquisition conditions, [18F]FC303 showed lower physiologic reference-organ SUVmean and malignant target-region SUVmax than [68Ga]Ga-PSMA-11, whereas tumor-to-liver and tumor-to-mediastinum ratios were not significantly different. Absolute SUVs were not interchangeable; [68Ga]Ga-PSMA-11-derived SUV thresholds should not be directly transferred to [18F]FC303 without tracer-specific calibration.
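The geometric mean ratio reported in this abstract is the back-transformed mean of paired log differences. A minimal sketch follows; the function name is hypothetical, and the SUVmax values are invented (not study data), chosen so that every pair has a ratio of 0.75.

```python
import math


def geometric_mean_ratio(numerator, denominator):
    """exp(mean(log(n_i / d_i))) over paired positive measurements."""
    logs = [math.log(n / d) for n, d in zip(numerator, denominator)]
    return math.exp(sum(logs) / len(logs))


# Hypothetical paired lesion SUVmax values, not taken from the preprint.
suv_f18 = [6.0, 9.0, 12.0, 15.0]
suv_ga68 = [8.0, 12.0, 16.0, 20.0]
gmr = geometric_mean_ratio(suv_f18, suv_ga68)
```

A GMR below 1 means the numerator tracer reads proportionally lower on average. This is why a fixed SUV threshold derived with one tracer does not carry over to the other without rescaling.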
Rezaeitaleshmahalleh, M.; Masoumi, S.; Debalme, E.; Sundt, T. M.; Aranki, S. F.; Shin, B.; Nezami, F. R.
Background: Coronary artery bypass grafting (CABG) remains the standard of care for complex multivessel and left main coronary artery disease. However, current preoperative planning remains largely subjective, relying on qualitative interpretation of coronary CT angiography (CCTA), operator-dependent stenosis grading, and fragmented multi-software workflows. Invasive fractional flow reserve (FFR), the reference standard for physiologic lesion assessment, is infrequently acquired preoperatively, leaving distal anastomosis planning without an objective hemodynamic basis. Methods: We developed a fully automated, AI-powered platform that converts routine CCTA into a patient-specific CABG planning workflow through five integrated modules: nnU-Net-based segmentation of coronary lumen and calcification; quantitative morphological and topological characterization generating more than thirty descriptors; automated stenosis detection using a local reference-radius formulation; a nine-point composite scoring framework for distal anastomosis site selection incorporating luminal caliber, landing-zone length, calcification burden, distal perfusion reserve, and bifurcation proximity; and interactive virtual graft construction coupled to a distributed reduced-order solver for pre- and post-bypass FFR estimation. Results: Lumen segmentation achieved a mean Dice similarity coefficient of 0.96 ± 0.01, whereas calcium segmentation achieved 0.73 ± 0.15 on the held-out cohort. Platform-derived FFR demonstrated strong agreement with invasively measured FFR (r=0.96, mean absolute relative difference 1.73 ± 1.42%) across the evaluated lesions, supporting the physiologic validity of the reduced-order hemodynamic solver. 
End-to-end analysis from raw CCTA to hemodynamic assessment and virtual graft planning was completed in approximately seven minutes per case on a standard workstation, representing a substantial reduction in processing time compared with conventional multi-tool and CFD-based workflows. Conclusions: The proposed platform demonstrates the feasibility of rapid, reproducible, and physiology-informed CABG planning using routine CCTA. By integrating anatomical characterization, automated target-site analysis, virtual graft construction, and reduced-order hemodynamic assessment into a single workflow, the framework provides objective, quantitative surgical decision support compatible with routine clinical workflows. Keywords: Coronary artery bypass grafting (CABG); Fractional flow reserve (FFR); Coronary CT angiography (CCTA); Surgical planning
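For readers unfamiliar with the two agreement metrics quoted in the results above, the following is a minimal sketch of how a Dice similarity coefficient and a mean absolute relative difference are commonly computed. It is not the authors' implementation: the function names, the flattened binary-mask representation, and the convention of returning 1.0 for two empty masks are all assumptions made for illustration.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two flattened binary masks.

    Hypothetical helper; masks are given as equal-length sequences of 0/1.
    """
    pred = [bool(p) for p in pred_mask]
    true = [bool(t) for t in true_mask]
    if len(pred) != len(true):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(p and t for p, t in zip(pred, true))
    total = sum(pred) + sum(true)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * overlap / total


def mean_abs_relative_difference(estimated, reference):
    """Mean of |estimated - reference| / |reference|, in percent."""
    if len(estimated) != len(reference):
        raise ValueError("inputs must have the same length")
    ratios = [abs(e - r) / abs(r) for e, r in zip(estimated, reference)]
    return 100.0 * sum(ratios) / len(ratios)
```

Here `dice_coefficient` would be applied per case to the predicted and reference segmentations, and `mean_abs_relative_difference` to paired platform-derived and invasive FFR values; how the abstract's summary statistics were aggregated across cases is not stated in the source.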
McBride, F.; Huang, H.; Kapoor, A. K.; Oermann, E.; Frontera, J. A.; Razavian, N.
Background and Purpose: Prognostication after acute ischemic stroke often relies on limited variables and simple risk scores, despite richer information being available at admission. We developed a multimodal AI model using admission data to predict modified Rankin Scale (mRS) outcomes and compared it to established tools. Methods: In a retrospective study of ischemic stroke/TIA patients, we trained three modality-specific models on admission non-contrast head CT, history and physical notes, and structured clinical variables, and combined them in a weighted-average ensemble. We predicted binary (mRS 0-2 versus 3-6) and ordinal mRS (0-6) outcomes at discharge and 90 days. Performance on an external test cohort was compared with THRIVE and SPAN-100 scores using AUROC, AUPRC, Brier score, mean absolute error (MAE), and quadratic weighted kappa (QWK). Results: A total of 6,915 patients were split into training, validation and testing cohorts in a 3:1:1 ratio. For discharge binary mRS (n=1596), the multimodal ensemble achieved significantly better discrimination (AUROC 0.859, AUPRC 0.858) with 25-61% lower Brier scores than THRIVE or SPAN-100 (all p<0.001). For 90-day binary mRS (n=207), the model also outperformed both THRIVE and SPAN-100 (AUROC 0.838, AUPRC 0.805, with 3-38% lower Brier scores). Ordinal mRS prediction showed similarly strong performance with significantly better QWK at discharge and numerically lower MAE. The multimodal ensemble model reassigned about one-third of patients to different risk categories versus THRIVE and was closer to the true discharge outcome in ~74% of discordant cases. Conclusions: We developed a well-calibrated multimodal AI model for prediction of discharge and 90-day post-stroke functional outcomes using only data present at the time of admission. This model outperforms existing prognostic tools and can support early clinical decision-making.
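As a rough illustration of the weighted-average ensemble and the Brier score named in this abstract, the sketch below combines per-model probabilities for the binary outcome and scores the result. It is an assumption-laden example, not the authors' code: the function names, the weights, and the toy probabilities are hypothetical, and the source does not report how the ensemble weights were chosen.

```python
def weighted_average_ensemble(model_probs, weights):
    """Combine per-model probabilities with a normalized weighted average.

    model_probs: one list of predicted probabilities per model (same length).
    weights: one non-negative weight per model (hypothetical values).
    """
    if len(model_probs) != len(weights):
        raise ValueError("need exactly one weight per model")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    n_patients = len(model_probs[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, model_probs)) / total_weight
        for i in range(n_patients)
    ]


def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    if len(probs) != len(outcomes):
        raise ValueError("inputs must have the same length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


# Toy example: three modality-specific models (imaging, notes, structured
# variables) scoring two patients; all numbers are made up for illustration.
imaging = [0.8, 0.2]
notes = [0.6, 0.4]
structured = [0.7, 0.1]
ensemble = weighted_average_ensemble([imaging, notes, structured], [2, 1, 1])
score = brier_score(ensemble, [1, 0])
```

A lower Brier score indicates predicted probabilities closer to the observed outcomes, which is the sense in which the abstract reports the ensemble's scores as lower than those of THRIVE and SPAN-100.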
Cresson, J.; Pere, M.; Szafranska, A.
This work focuses on the global and partial identification problem for fractional differential equations. We provide a general numerical procedure based on global and local optimization algorithms with two refinements for biological systems that ensure solution positivity and homogeneous parameter units. The method is applied to a new fractional model of Dengue outbreak called the Fractional Homogeneous Nishiura (FHN) model, calibrated using data of newly infected people in Cape Verde. We show that our identification method yields a better fit between data and model solutions than previous approaches and that our FHN model captures the dynamics of Dengue more closely than existing systems.
Leppert, I. R.; Benbachir, A.; Campbell, J. S.; Coelho, S.; Feizollah, S.; Nelson, M. C.; Brais, B.; Cocozza, S.; Pike, G. B.; La Piana, R.; Tardif, C. L.
Background: Autosomal recessive spastic ataxia of Charlevoix-Saguenay (ARSACS) is a genetic disease characterized by spasticity and ataxia which reflects involvement of the corticospinal tracts (CST) and cerebellum. The primary involvement of the middle cerebellar peduncles (MCP) and transverse pontine fibers (TPF) at the crossing with the CST, and their role in the pathophysiology of the disease, is currently debated. Objectives: Advanced MRI techniques capable of isolating sub-voxel microstructural parameters can test the hypothesis that the MCP and TPF are abnormally large, compressing the CST at their crossing, and potentially impairing CST development. Methods: Tract macro- and micro-structural properties, including axon and tract caliber, axon density and geometry, and myelin content were estimated from diffusion-relaxometry and magnetization transfer imaging. These features were analyzed along segments of the CST, MCP, and TPF of 9 patients and 9 age-matched controls. Results: While the CST showed significant decreases in tract size, axon caliber, and myelination throughout its length compared to controls (p<0.01), the MCP and TPF were relatively unaffected. In our group, neither the MCP nor the pons were enlarged. The proximal MCP showed an increase in axon caliber. Conclusions: The increase in fractional anisotropy and axon density towards the center of the TPF could be driven by geometric confounds related to differences in the relative sizes of the CST and TPF compared to controls. This highlights the importance of investigating tract-specific microstructural profiles, particularly in regions of geometric complexity. The findings confirm the involvement of the CST, with a relatively limited involvement of the MCP and TPF.